adjoint operator
A Technical Proofs. Proof of Proposition 4.1. Using the chain rule, (1), and the definitions of null ...
This appendix presents the technical details of efficiently implementing Algorithm 2. B.1 Computing Intermediate Quantities. We argue that in the setting of neural networks, Algorithm 2 can obtain the intermediate quantities ζ ... Algorithm 3 gives a subroutine for computing the necessary scalars used in the efficient squared-norm function of the embedding layer. Algorithm 3: Computing the Nonzero Values of n ... In the former case, it is straightforward to see that we incur a compute (resp. ...) ... F.1 Effect of Batch Size on Fully-Connected Layers. Figure 4 presents numerical results for the same set of experiments as in Subsection 5.1 but for different batch sizes |B| instead of the output dimension q. Similar to Subsection 5.1, the results in Figure 4 are more favorable towards Adjoint compared to GhostClip.
Importance Sampling for Nonlinear Models
Rajmohan, Prakash Palanivelu, Roosta, Fred
While norm-based and leverage-score-based methods have been extensively studied for identifying "important" data points in linear models, analogous tools for nonlinear models remain significantly underdeveloped. By introducing the concept of the adjoint operator of a nonlinear map, we address this gap and generalize norm-based and leverage-score-based importance sampling to nonlinear settings. We demonstrate that sampling based on these generalized notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping, similar to linear subspace embeddings. As direct applications, these nonlinear scores not only reduce the computational complexity of training nonlinear models by enabling efficient sampling over large datasets but also offer a novel mechanism for model explainability and outlier detection. Our contributions are supported by both theoretical analyses and experimental results across a variety of supervised learning scenarios.
- Oceania > Australia > Queensland > Brisbane (0.04)
- North America > United States > Rocky Mountains (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- (4 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
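The linear setting that the abstract above generalizes can be made concrete with a short sketch. The NumPy snippet below illustrates classical leverage-score importance sampling for a linear least-squares problem: rows are sampled with probability proportional to their leverage scores and reweighted so that the subsampled problem approximates the full one. This is only the linear baseline; the paper's adjoint-based nonlinear scores are not implemented here, and all function names are illustrative.

```python
import numpy as np

def leverage_scores(A):
    """Leverage scores of the rows of A: l_i = ||U[i, :]||^2,
    where A = U S V^T is a thin SVD."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U**2, axis=1)

def leverage_sample(A, b, m, rng):
    """Sample m rows with probability proportional to leverage,
    rescaling each sampled row by 1/sqrt(m * p_i) so that the
    subsampled Gram matrix is an unbiased estimate of A^T A."""
    l = leverage_scores(A)
    p = l / l.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=p)
    w = 1.0 / np.sqrt(m * p[idx])
    return A[idx] * w[:, None], b[idx] * w

rng = np.random.default_rng(0)
n, d = 5000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Full solve vs. solve on a small leverage-sampled sketch.
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
A_s, b_s = leverage_sample(A, b, m=200, rng=rng)
x_samp, *_ = np.linalg.lstsq(A_s, b_s, rcond=None)
print(np.linalg.norm(x_full - x_samp) / np.linalg.norm(x_full))
```

The printed relative error should stay small even with m well below n, mirroring the subspace-embedding guarantee that the paper extends to nonlinear maps.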
Accelerating Deep Unrolling Networks via Dimensionality Reduction
Tang, Junqi, Mukherjee, Subhadip, Schönlieb, Carola-Bibiane
In this work we propose a new paradigm for designing efficient deep unrolling networks using dimensionality reduction schemes, including minibatch gradient approximation and operator sketching. Deep unrolling networks are currently the state-of-the-art solutions for imaging inverse problems. However, for high-dimensional imaging tasks, especially X-ray CT and MRI, deep unrolling schemes typically become inefficient in both memory and computation, due to the need to compute the high-dimensional forward and adjoint operators multiple times. Recently, researchers have found that such limitations can be partially addressed by unrolling stochastic gradient descent (SGD), inspired by the success of stochastic first-order optimization. In this work, we explore this direction further and first propose a more expressive and practical stochastic primal-dual unrolling, based on the state-of-the-art Learned Primal-Dual (LPD) network, and then a further acceleration of stochastic primal-dual unrolling, using sketching techniques to approximate products in the high-dimensional image space. Operator sketching can be applied jointly with stochastic unrolling for the best acceleration and compression performance. Our numerical experiments on X-ray CT image reconstruction demonstrate the remarkable effectiveness of our accelerated unrolling schemes.
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)
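As a rough illustration of the minibatch gradient approximation idea underlying stochastic unrolling (not the Learned Primal-Dual architecture itself), the toy sketch below makes each unrolled "layer" a gradient step on ||Ax - y||^2 that touches only a random block of the operator's rows, so per-iteration cost scales with the block size rather than with the full operator. All names are illustrative and the learned components are omitted.

```python
import numpy as np

def unrolled_sgd_recon(A, y, steps, batch, step_size, rng):
    """Toy unrolled reconstruction: each 'layer' is a gradient step
    on ||Ax - y||^2 that uses only a random block of rows of A,
    mimicking the stochastic unrolling idea (no learned parts)."""
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        A_b, y_b = A[idx], y[idx]
        # Unbiased minibatch estimate of the full gradient A^T (A x - y).
        g = (n / batch) * A_b.T @ (A_b @ x - y_b)
        x = x - step_size * g
    return x

rng = np.random.default_rng(0)
n, d = 2000, 100
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = rng.standard_normal(d)
y = A @ x_true

x_hat = unrolled_sgd_recon(A, y, steps=200, batch=200, step_size=0.5, rng=rng)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

In an imaging setting, the row blocks would correspond to angular subsets of the CT projection operator, which is where the memory and compute savings come from.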
Adjoint-Functions and Temporal Learning Algorithms in Neural Networks
The development of learning algorithms is generally based upon the minimization of an energy function. It is a fundamental requirement to compute the gradient of this energy function with respect to the various parameters of the neural architecture, e.g., synaptic weights, neural gain, etc. In principle, this requires solving a system of nonlinear equations for each parameter of the model, which is computationally very expensive. A new methodology for neural learning of time-dependent nonlinear mappings is presented. It exploits the concept of adjoint operators to enable a fast global computation of the network's response to perturbations in all the system's parameters. The importance of the time boundary conditions of the adjoint functions is discussed. An algorithm is presented in which the adjoint sensitivity equations are solved simultaneously (i.e., forward in time) along with the nonlinear dynamics of the neural networks. This methodology makes real-time applications and hardware implementation of temporal learning feasible.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- Asia > China (0.04)
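To make the adjoint idea concrete, here is a minimal sketch for a discrete-time nonlinear dynamics x_{t+1} = tanh(W x_t): one backward pass over adjoint variables lam_t = dE/dx_t recovers dE/dW for every weight at once, instead of one perturbed simulation per parameter. The paper's forward-in-time formulation differs in how the adjoint equations are integrated; this sketch shows only the standard backward variant, under illustrative names.

```python
import numpy as np

def simulate(W, x0, T):
    """Run the discrete-time dynamics x_{t+1} = tanh(W x_t)."""
    xs = [x0]
    for _ in range(T):
        xs.append(np.tanh(W @ xs[-1]))
    return xs

def adjoint_gradient(W, xs, target):
    """Gradient of E = 0.5 ||x_T - target||^2 w.r.t. every entry of W
    via adjoint variables lam_t = dE/dx_t, propagated backward."""
    lam = xs[-1] - target                        # dE/dx_T
    grad = np.zeros_like(W)
    for t in range(len(xs) - 2, -1, -1):
        pre = W @ xs[t]                          # pre-activation at step t
        delta = (1.0 - np.tanh(pre)**2) * lam    # pull lam through tanh'
        grad += np.outer(delta, xs[t])           # dE/dW contribution at step t
        lam = W.T @ delta                        # adjoint step backward in time
    return grad

rng = np.random.default_rng(0)
N, T = 5, 20
W = 0.5 * rng.standard_normal((N, N))
x0, target = rng.standard_normal(N), rng.standard_normal(N)

xs = simulate(W, x0, T)
g = adjoint_gradient(W, xs, target)

# Finite-difference check on one weight agrees with the adjoint gradient.
eps, (i, j) = 1e-6, (1, 2)
Wp = W.copy(); Wp[i, j] += eps
E = lambda xs_: 0.5 * np.sum((xs_[-1] - target)**2)
print(g[i, j], (E(simulate(Wp, x0, T)) - E(xs)) / eps)
```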
Adjoint Operator Algorithms for Faster Learning in Dynamical Neural Networks
Barhen, Jacob, Toomarian, Nikzad Benny, Gulati, Sandeep
A methodology for faster supervised learning in dynamical nonlinear neural networks is presented. It exploits the concept of adjoint operators to enable computation of changes in the network's response due to perturbations in all system parameters, using the solution of a single set of appropriately constructed linear equations. The lower bound on speedup per learning iteration over conventional methods for calculating the neuromorphic energy gradient is O(N^2), where N is the number of neurons in the network.

1 INTRODUCTION

The biggest promise of artificial neural networks as computational tools lies in the hope that they will enable fast processing and synthesis of complex information patterns. In particular, considerable efforts have recently been devoted to the formulation of efficient methodologies for learning (e.g., Rumelhart et al., 1986; Pineda, 1988; Pearlmutter, 1989; Williams and Zipser, 1989; Barhen, Gulati and Zak, 1989). The development of learning algorithms is generally based upon the minimization of a neuromorphic energy function. The fundamental requirement of such an approach is the computation of the gradient of this objective function with respect to the various parameters of the neural architecture, e.g., synaptic weights, neural ...
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- Europe > France (0.04)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.94)
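The single-linear-solve claim in the abstract above can be illustrated on a toy fixed-point network x* = tanh(W x* + b). Implicit differentiation of the fixed-point condition yields the entire gradient dE/dW from one adjoint solve with (I - J)^T, where J is the Jacobian of the update map at x*, rather than one perturbed relaxation per weight; that is the source of the O(N^2)-type savings. This is a generic recurrent-backpropagation-style sketch under illustrative names, not the authors' exact formulation.

```python
import numpy as np

def relax(W, b, iters=500):
    """Relax the network to a fixed point x* = tanh(W x* + b)."""
    x = np.zeros(len(b))
    for _ in range(iters):
        x = np.tanh(W @ x + b)
    return x

def adjoint_grad(W, b, target):
    """dE/dW for E = 0.5 ||x* - target||^2 from ONE linear solve:
    (I - J)^T lam = (x* - target), with J = diag(tanh'(u)) W."""
    x = relax(W, b)
    u = W @ x + b
    s = 1.0 - np.tanh(u)**2                    # tanh'(u)
    J = s[:, None] * W                         # Jacobian of the map at x*
    lam = np.linalg.solve((np.eye(len(b)) - J).T, x - target)
    return np.outer(lam * s, x), x             # dE/dW_ij = lam_i * s_i * x_j

rng = np.random.default_rng(0)
N = 6
W = 0.1 * rng.standard_normal((N, N))          # small weights keep the relaxation contractive
b = rng.standard_normal(N)
target = rng.standard_normal(N)

g, x_star = adjoint_grad(W, b, target)

# Finite-difference check on a single weight.
eps, (i, j) = 1e-6, (0, 1)
Wp = W.copy(); Wp[i, j] += eps
E = lambda x: 0.5 * np.sum((x - target)**2)
print(g[i, j], (E(relax(Wp, b)) - E(x_star)) / eps)
```

A naive forward-perturbation scheme would rerun the relaxation once per weight, i.e., N^2 times; the adjoint route replaces all of that with a single N-by-N linear solve, consistent with the speedup bound quoted in the abstract.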